Selection and Validation of Reference Genes for RT-qPCR Analysis of Gene Expression in Nicotiana benthamiana upon Single Infections by 11 Positive-Sense Single-Stranded RNA Viruses from Four Genera

Quantitative real-time PCR (RT-qPCR) is a widely used method for studying alterations in gene expression upon infections caused by diverse pathogens such as viruses. Positive-sense single-stranded (ss(+)) RNA viruses form a major part of all known plant viruses, and some of them are damaging pathogens of agriculturally important crops. Analysis of gene expression following infection by ss(+) RNA viruses is crucial for the identification of potential anti-viral factors. However, viral infections are known to globally affect gene expression and therefore selection and validation of reference genes for RT-qPCR is particularly important. In this study, the expression of commonly used reference genes for RT-qPCR was studied in Nicotiana benthamiana following single infection by 11 ss(+) RNA viruses, including five tobamoviruses, four potyviruses, one potexvirus and one polerovirus. Stability of gene expression was analyzed in parallel by four commonly used algorithms: geNorm, NormFinder, BestKeeper, and Delta CT, and RefFinder was finally used to summarize all the data. The most stably expressed reference genes differed significantly among the viruses, even when those viruses were from the same genus. Our study highlights the importance of the selection and validation of reference genes upon different viral infections.


Introduction
Infections of positive-sense single-stranded RNA (ss(+) RNA) viruses are major threats to agricultural production [1]. To systemically infect plant cells, ss(+) RNA viruses cooperate with host factors to manipulate host genes, thereby establishing a suitable microenvironment for viral replication and trafficking. Therefore, analysis of gene expression upon viral infection is critical for the identification of potential anti-viral factors. To study virus-host interactions, fluorescence-based quantitative real-time PCR (RT-qPCR) is often used to quantify the transcript levels of target genes [2,3]. The use of qPCR to accurately quantify transcript abundance is largely influenced by the amount of initiating material, RNA quality, reverse transcriptase efficiency, amplification efficiency, etc. [4]. To normalize these differences, the use of stably expressed reference genes is required [5]. The internal reference gene is an internal reaction control whose sequence is different from the target gene. To serve as a reliable reference for RT-qPCR a gene must meet several important criteria, of which the most important is that its expression level is not affected by any of the experimental treatments [6]. In addition, its expression should vary minimally in different tissues as well as under different physiological conditions. Actin, GAPDH and EF1α play critical roles in life activities, and their expression is largely independent of external factors such as abiotic stresses [7]. Therefore, these genes are widely used as references in RT-qPCR to analyze gene expression [8]. In Nicotiana benthamiana (N. benthamiana), the stability of several reference genes has been tested in developmentally distinct tissues and during abiotic stresses [9]. Moreover, previous studies have also analyzed the stability of certain reference genes during infection with a variety of viruses, including necroviruses, benyviruses, hordeiviruses, potexviruses, and tobacco rattle virus [10,11]. However, the stability of expression of most reference genes in host plants during infection with most of the reported plant viruses has not been well studied.
It is known that the expression of some genes that can be used as references under certain conditions may show significantly altered expression under other conditions [7]. To systemically infect hosts, viruses replicate and assemble in host cells using host components to produce new progeny viruses and initiate a new round of life cycle [12]. In this process, plant viruses typically interfere with and engage the host's cellular pathways, and a large portion of the most widely used reference genes are components of these cellular pathways [13]. In addition, different viral infestations may affect different pathways, and the extent to which these pathways are affected varies depending on the time of viral infestation and the type of cells infected. Some ss(+) RNA viruses may hijack host metabolic enzymes and housekeeping proteins to facilitate their replication. For instance, tomato bushy stunt virus (TBSV) can bind GAPDH, a commonly used reference gene that plays critical roles in glycolysis [14], membrane fusion, vascular bundles, nuclear RNA output, DNA replication, DNA repair [15] and RNA stability [16]. By binding GAPDH to the 3 untranslated region (UTR) of the TBSV negative-sense strand during replication the synthesis of the positive-sense chain is stimulated by increasing the stability of template [17,18]. However, the stability of GAPDH expression during TBSV infection has not been carefully analyzed in many host plants. Another example is that tobacco mosaic virus (TMV) can bind eEF1A, a highly conserved and abundant housekeeping protein in eukaryotes [19], to maintain the activity of viral RdRp [20]. Moreover, TMV is known to induce the expression of EF1A [21]. Therefore, it is particularly important to select and validate RT-qPCR reference genes to analyze the expression of host genes upon viral infection [22].
Owing to its ability to be infected by a wide variety of plant viruses, N. benthamiana has been widely used as an experimental model for studying plant-virus interactions [23]. Plant virus vector-based expression systems, the construction of virus-induced gene silencing (VIGS) systems, and agrobacterium infiltration are all developed using N. benthamiana plants [24]. In this study, N. benthamiana infection was therefore used as a model to analyze the stability of commonly used reference genes in the context of infections caused by 11 ss(+) RNA viruses, including five tobamoviruses (TMV, tomato brown rugose fruit virus (ToBRFV), pepper mild mottle virus (PMMoV), tobacco mild green mosaic virus (TMGMV) and tomato mottle mosaic virus (ToMMV)), four potyviruses (turnip mosaic virus (TuMV), chilli ringspot virus (ChiRSV), chilli veinal mottle virus (ChiVMV) and pepper mottle virus (PePMoV)), one potexvirus (potato virus X (PVX)) and one polerovirus (pod pepper vein yellows virus (PoPeVYV)). Five commonly used algorithms, geNorm, NormFinder, BestKeeper, Delta CT, and RefFinder were adopted to identify the most stably expressed reference genes following infection by the different viruses. The results show that the most promising reference genes can even be different for viruses of the same genus and identify useful reference genes for future studies of gene expression following infection by these 11 viruses. Our analysis also revealed that the selection and validation of reference genes in the context of different viral infections are necessary.

Assessment of Primer Specificity and Amplification Efficiency
A total of 28 reference genes and their forward and reverse primers used for RT-qPCR were obtained from published literature. The primer sequences of all genes and information on the length of amplification products are provided in Table S1. We used RT-qPCR melting curve analysis, 1% agarose gel electrophoresis and ethidium bromide staining, and PCR product sequencing to verify the specificity of the primers. Among all the 28 selected candidate genes in Table S1, genes in black letters indicate that the melting curve obtained by RT-qPCR was single-peaked and only one band was visualized after ethidium bromide staining ( Figures 1A and S1), indicating high specificity of the amplification product; genes in red letters indicate the presence of double peaks in the melting curve or a red curve ( Figure 1B), revealing the presence of non-specific amplification for that primer or low transcript levels of the target genes; genes in blue letters indicate that, as revealed by Sanger sequencing of the RT-qPCR products after cloning into pGEM ® -T vector, the obtained sequence of at least three clones did not match the expected target gene sequences. We finally selected 13 genes from the 28 candidates for the subsequent experiments. The amplification efficiencies of these 13 primer pairs were analyzed. The results showed that the amplification efficiencies ranged from 91.9% to 110% (Table 1), which were all within the acceptable range of 90-110%. Furthermore, the standard curves demonstrated good linear relationships (R 2 > 0.9874) between the cycle threshold (Ct) values and the log-transformed copy numbers of 13 tested reference genes ( Table 1). The above results indicate that the primers of these 13 pairs of candidate reference genes were suitable for following RT-qPCR experiments.  Figure S2A. The data of RT-PCR confirmation of viral infection are shown in Figure S2B.

Confirmation of Viral Infection and Sampling
At 6 days after inoculation, plants infected with six viruses (TMV, ToBRFV, PMMoV, TMGMV, ToMMV, and TuMV) had obvious symptoms (curved leaves, yellow spotting, etc.) in the systemic leaves, whereas it took 15 days for plants infected with the other five viruses to show typical symptoms. Symptoms induced by infections of representative viruses from four genera are shown in Figure S2A. The data of RT-PCR confirmation of viral infection are shown in Figure S2B.

Expression Analysis of the 13 Selected Reference Genes by RT-qPCR
To analyze the differential expression of the 13 candidate reference genes in N. benthamiana under infection of each of the 11 viruses, RNA samples were prepared using leaf samples collected at either 6 or 15 dpi depending on the time symptoms appeared (see above). RNA integrity was analyzed by 5% urea polyacrylamide gel electrophoresis. After ethidium bromide staining, ribosomal RNA bands with sharp edges were observed, indicating high RNA integrity ( Figure S3). RNA concentrations were determined and only the ones with a 260/280 ratio close to 2.0 and a 260/230 ratio within the range of 2.0-2.2 were used in RT-qPCR. Prior to reverse transcription, genomic DNA was digested by incubating the RNA samples with DNase I for 2 h at 37 °C. Melting curves of the RT-qPCR products were checked to make sure only a single sharp peak was obtained. Ct values of

Expression Analysis of the 13 Selected Reference Genes by RT-qPCR
To analyze the differential expression of the 13 candidate reference genes in N. benthamiana under infection of each of the 11 viruses, RNA samples were prepared using leaf samples collected at either 6 or 15 dpi depending on the time symptoms appeared (see above). RNA integrity was analyzed by 5% urea polyacrylamide gel electrophoresis. After ethidium bromide staining, ribosomal RNA bands with sharp edges were observed, indicating high RNA integrity ( Figure S3). RNA concentrations were determined and only the ones with a 260/280 ratio close to 2.0 and a 260/230 ratio within the range of 2.0-2.2 were used in RT-qPCR. Prior to reverse transcription, genomic DNA was digested by incubating the RNA samples with DNase I for 2 h at 37 • C. Melting curves of the RT-qPCR products were checked to make sure only a single sharp peak was obtained. Ct values of three biological replicates of each virus-infected group and the Mock infection group were displayed by box line plots to represent the variations ( Figure 2). Among the four potyviruses TuMV, ChiRSV, ChiVMV, and PePMoV, the genes that varied the most in Ct values were GAPDH, AGO2, AGO2, and GADPH, respectively; the genes with the least variations in Ct values were RdR6, UBC, Lip, and Lip, respectively.
In the case of PVX infection (genus Potexvirus), the Ct value of gene Lip varied the most and the gene Tspan varies the least and for PoPeVYV (genus Polerovirus) the gene with the biggest variation in Ct value was L23 and the gene with the smallest variation was UBC.

Stability Analysis of the 13 Selected Reference Genes
To further evaluate the expression stability of the 13 candidate reference genes, four Among the four potyviruses TuMV, ChiRSV, ChiVMV, and PePMoV, the genes that varied the most in Ct values were GAPDH, AGO2, AGO2, and GADPH, respectively; the genes with the least variations in Ct values were RdR6, UBC, Lip, and Lip, respectively.
In the case of PVX infection (genus Potexvirus), the Ct value of gene Lip varied the most and the gene Tspan varies the least and for PoPeVYV (genus Polerovirus) the gene with the biggest variation in Ct value was L23 and the gene with the smallest variation was UBC.

Stability Analysis of the 13 Selected Reference Genes
To further evaluate the expression stability of the 13 candidate reference genes, four different algorithms (geNorm, NormFinder, BestKeeper, and Delta-CT) were used to calculate expression stabilities individually. Finally, a comprehensive analysis platform RefFinder was used to integrate the values obtained from the four algorithms to identify the one with the highest stability upon the infection of each virus.
As shown in Table 2, the different algorithms were not consistent with one another in identifying the most and least stable genes for each virus. When integrated in RefFinder (Table 3), the top four reference genes for TMV were Lip, eIF4A, EF1α, and L23 while GAPDH, AGO2, and ACT were the least stable; since the M value of GAPDH was greater than 1.5 it is very unsuitable as a reference in this regard. For PMMoV infection, RefFinder identified the top four stable reference genes as RdR6, UBC, Tspan, and ACT while the least stable were AGO2, PGK, and L23, none of which were suitable as a reference gene. For TMGMV, F-BOX, RdR6, Lip, and ACT were the top four stable reference genes while GAPDH, AGO2, UBC, and eIF4A were not suitable as reference genes. For ToBRFV, the top four stable reference genes were F-BOX, UBC, EF1α, and PGK, while AGO2, GAPDH, and L23 were the three least stable genes. For ToMMV, RdR6, Tspan, Lip, and eIF4A were the top four stable genes while the least stable were GAPDH, ACT, AGO2, and L23.  Among the potyviruses, RefFinder identified PGK, RdR6, eIF4A, and Tspan as the top four stable reference genes for TuMV, while the most unstable were GAPDH, AGO2, ACT, and UBC (Table 3). For ChiRSV, Tspan, UBC, EF1α, and RdR6 were the top four stable reference genes, while the most unstable were AGO2, L23, ACT, and GAPDH. For ChiVMV, Tspan, eIF4A, UBC, and F-BOX were the four most stable genes while the most unstable were AGO2, GAPDH, L23, and RdR6). For PePMoV, the four algorithms gave very similar results with Lip, eIF4A, KLC, and Tspan being the most stable (Table 2); the most unstable were AGO2, GAPDH, ACT, and RdR6 (Table 3).
For PVX, RefFinder identified F-BOX, Tspan, ACT, and PGK as the top four stable candidate reference genes, while the most unstable were Lip, KLC, EF1α, and RdR6. For PoPeVYV, the four most stable genes were Tspan, F-BOX, eIF4A, and PGK, with RdR6, AGO2, L23, and Lip being the most unstable.

Discussion
RT-qPCR has become one of the important methods for rapid, sensitive, and quantitative comparison of gene expression levels [25]. Many experimental factors, such as the quantity of starting material, the quality of RNA, and the efficiency of reverse transcriptase, can affect the efficacy of RT-qPCR [4,26]. Therefore, accurate normalization, which relies on the selection of housekeeping reference genes, is crucial. Unfortunately, there are no universal references that can be used under diverse conditions to accurately reflect alterations in gene expression [27]. In the case of different viral infections that can globally affect gene expression, it is particularly important to select and validate suitable internal references to identify the targets of viruses and potential anti-viral factors [28]. Therefore, a systematic evaluation of reference genes must be performed before the analysis of gene expression upon viral infections [29].
Interestingly, our comprehensive data show that different reference genes should be used for different viruses, even when those viruses are members of the same genus. For example, while Lip, eIF4A, EF1α, and L23 appeared to be most suitable as reference genes for infection by TMV, those most suitable for PMMoV (another tobamovirus) were RdR6, UBC, Tspan, and ACT. Thus, there is no single gene appropriate for use with all tobamoviruses. Similar conclusions were obtained after the analysis of gene expression in the context of infections by four different potyviruses. Our study therefore suggests that reference genes need to be evaluated and selected on a case-by-case basis in the context of viral infections; they cannot be simply adopted from previous studies using a healthy plant or a different virus.
Our conclusions are also supported by previous studies. F-BOX is stably expressed in a variety of plant species under different biotic or abiotic stresses. In Arabidopsis thaliana, F-BOX is one of the genes that exhibits the most stable expression levels when plants are infected by viruses [30] or when they are under metal stress [31]. Analysis of gene expression in N. benthamiana plants infected with five RNA plant viruses also revealed that the F-BOX was the most stable reference gene [10]. In our study, F-BOX was also the most suitable reference gene in N. benthamiana plants infected with TMGMV, ToBRFV, and PVX. However, F-BOX expression levels varied a lot in plants infected by TMV or TuMV. Tetraspanins are a family of small transmembrane proteins that play crucial roles in intracellular transport, cellular signal transduction, and cell migration [32]. More than 33 Tspans, such as Tspan7, span9, CD9, CD81, and CD151, are known to be involved in the process of virus infestation in human cells [33]. Tetraspanins in A. thaliana facilitate CMV infection through specific membrane-recognition motifs [34]. However, in our study Tspan (Tetraspanin-20) displayed stable expression in N. benthamiana plants infected by ToMMV, ChiRSV, ChiVMV or PVX, suggesting that Tspan plays no significant role in infection by these viruses. GAPDH, AGO2, and actins are three of the most widely used reference genes in various plant species [22,35], but they have altered expression under various biotic or abiotic stresses [26,30,36]. Consistently, our study also revealed the altered expression of GAPDH, AGO2, and actins upon viral infections, suggesting that these are not suitable as reference genes in studies of viral infection. Interestingly, a previous study reported that L25 and EF-1α can be used as stable reference genes for gene expression analysis in both developmental and stress-treated samples of N. benthamiana plants [9]. The complexity of gene expression analysis under biotic stress treatments, especially viral infections, was highlighted by that study and our results.

Plant Materials and Virus Inoculation
N. benthamiana plants were grown in a growth chamber with a 16-h light/8-h dark cycle at 24 • C and 50% relative humidity. Infectious clones of 11 viruses, including TMV, PMMoV, TMGMV, ToBRFV, ToMMV, TuMV, ChiRSV, ChiVMV, PePMoV, PVX, and PoPeVYV were all from our laboratory. The information of genome size, backbone vector, and promoter were listed in Table S2. All viruses were agro-inoculated into N. benthamiana at the five-true-leaf stage according to published methods [37]. Three biological replicates (three plants per replicate) were included for each virus and mock control. Viral infection was confirmed by disease symptoms.

Total RNA Isolation and cDNA Synthesis
Depending on the onset of symptoms (see Results Section 2.2, above), leaf samples were collected from the first two fully expanded systemic leaves of N. benthamiana plants infected with TMV, ToBRFV, PMMoV, TMGMV, ToMMV, and TuMV at 6 days post inoculation (dpi) and from those infected with the remaining five viruses and 15 dpi. Total RNA was extracted using TRIzol TM Reagent (Invitrogen, Shanghai, China) according to the manufacturer's instructions. RNA concentration was measured using a Nanodrop ONE spectrophotometer (Nanodrop Technologies, Rockland, DE, USA). RNA integrity was analyzed by 5% urea polyacrylamide gel electrophoresis, according to standard procedures [38]. First Strand cDNA was synthesized by reverse transcription (RT) using EasyScript ® . Allin-One First-Strand cDNA Synthesis SuperMix for qPCR (Transgen, Beijing, China) with 1 µg total RNA as template. All procedures were completed following the manufacturer's protocol. Thermal conditions for RT were: 15 min at 37 • C, and then 5 s at 85 • C to inactivate the enzyme.

Selection of Candidate Reference Genes
Reference genes and the forward and reverse primers were obtained from published literature [10,[39][40][41][42][43][44], or designed in this study. Primer sequences and the references are listed in Table S1. Primer specificity was evaluated using RT-qPCR and PCR amplification, in which the cDNA sample was derived from an RNA sample from a healthy N. benthamiana plant. The amplified PCR products were visualized by ethidium bromide staining after 1% agarose gel electrophoresis. PCR products were also purified and cloned into blunt zero T-vector (Transgen, Beijing, China) and sequenced. The amplification efficiency of each pair of primers was analyzed using the built-in software of the Roche LightCycler480 instrument and EXCEL. Standard curves were generated from a five-fold dilution (initial concentration = 14.6 ng/µL) series of one sample.

RT-PCR Confirmation of Viral Infection and RT-qPCR
RT-PCR was performed to confirm the existence of viral genome RNA in systemic leaves. Random primers were used to synthesize cDNA through RT. Primer sequences of PCR were listed in Table S3. RT-qPCR was performed on a LightCycler480 instrument (Roche, Switzerland) with 384-well PCR plates. Reaction mixtures were composed of 18 µL of ChamQ Universal YBR QPCR Master Mix (Vazyme, Nanjing, China), 0.6 µL of each primer (10 µM), 6 µL of cDNA template, and 10.8 µL of RNase free water. Three biological replicates were included for each experiment, and three technical replicates were set for each biological replicate. Thermal cycles of RT-qPCR were: 95 • C for 30 s, then 40 cycles of 15 s at 95 • C, 15 s at 55 • C, and 30 s at 72 • C. Melting curves were prepared through the following thermal conditions: 95 • C for 10 s, 60 • C for 30 s, and 95 • C for 15 s. Amplification efficiencies were calculated using the built-in software of the Roche LightCycler 480 instrument.
For geNorm and NormFinder, the raw Ct values were converted to the data input format (2 −∆Ct ) required by the software. Moreover, ∆Ct = each Ct value-the lowest Ct value. In contrast, raw Ct values were directly used as input data in BestKeeper and RefFinder analysis. The output of geNorm software analysis is the expression stability value M, which represents the average pairwise variation between a single gene and other internal reference genes. With M = 1.5 as the cutoff value, a low M value indicates stable gene expression. In addition, geNorm also provides pairwise variation values (Vn/n+1) to represent the optimal number of reference genes for normalization. When Vn/n+1 < 0.15, the most suitable number of reference genes was n. When Vn/n+1 > 0.15, the most appropriate number of reference genes is n + 1. Normfinder provides the stability value SV to evaluate the differences in the expression of candidate reference genes and the variations among sample groups. Like the M value, a low SV value indicates stable gene expression. The Delta-CT method identifies the potential reference genes by comparing the relative expression between gene pairs. According to the lowest coefficient of variation (CV) and the lowest relative standard deviation (SD), BestKeeper software was used to determine the most stable reference genes. When SD > 1, the expression of the reference gene was unstable. RefFinder is a web-based comprehensive reference gene analysis platform. It integrates the values obtained using algorithms geNorm, NormFinder, BestKeeper, and Delta-CT to obtain the optimal internal reference genes.

Conclusions
In conclusion, our study identified optimal RT-qPCR reference genes for gene expression analysis in N. benthamiana plants upon single infections by 11 ss(+) RNA viruses from four genera. Our data also highlighted the importance of the selection and validation of reference genes for gene expression analysis in the context of viral infections.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/plants12040857/s1, Figure S1: Analysis of RT-qPCR product of 28 genes through 1% agarose gel electrophoresis and ethidium bromide staining; Figure S2: Symptoms induced by infections of 11 viruses from four genera (A) and RT-PCR confirmation of viral infection (B); Figure S3: Analysis of RNA integrity through 5% urea-PAGE gel electrophoresis and ethidium bromide staining; Table S1: Information of the 28 selected potential RT-qPCR reference genes and primer sequences; Table S2: Information of expression vectors of the 11 viruses used in this study; Table S3: Sequences of primers used in RT-PCR to confirm viral infection.